Author Identifiers in Scholarly Repositories
Bibliometric and usage-based analyses and tools highlight the value of
information about scholarship contained within the network of authors, articles
and usage data. Less progress has been made on populating and using the author
side of this network than the article side, in part because of the difficulty
of unambiguously identifying authors. I briefly review a sample of author
identifier schemes, and consider their use in scholarly repositories. I then describe
preliminary work at arXiv to implement public author identifiers, services
based on them, and plans to make this information useful beyond the boundaries
of arXiv.
Comment: 10 pages. Based on a presentation given at Open Repositories 200
Eprints and the Open Archives Initiative
The Open Archives Initiative (OAI) was created as a practical way to promote
interoperability between eprint repositories. Although the scope of the OAI has
been broadened, eprint repositories still represent a significant fraction of
OAI data providers. In this article I present a brief survey of OAI eprint
repositories, and of services using metadata harvested from eprint repositories
using the OAI protocol for metadata harvesting (OAI-PMH). I then discuss
several situations where metadata harvesting may be used to further improve the
utility of eprint archives as a component of the scholarly communication
infrastructure.
Comment: 13 page
Exposing and harvesting metadata using the OAI metadata harvesting protocol: A tutorial
In this article I outline the ideas behind the Open Archives Initiative
metadata harvesting protocol (OAIMH), and attempt to clarify some common
misconceptions. I then consider how the OAIMH protocol can be used to expose
and harvest metadata. Perl code examples are given as a practical illustration.
Comment: 13 pages, 1 figure. Example programs included (download source).
HEPLW version (HTML) available online at
http://library.cern.ch/HEPLW/4/papers/3
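The harvesting idea outlined in this abstract can be illustrated with a minimal sketch. The article's own examples are in Perl; the version below is in Python and is not taken from the article. It assumes the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces of OAI-PMH 2.0, which may differ from the protocol version the article describes. The base URL, the sample response and the function names are hypothetical.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces assumed from OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # A resumptionToken is sent on its own, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, [titles])], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        identifier = rec.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        titles = [t.text for t in rec.iter(DC_NS + "title")]
        records.append((identifier, titles))
    token = root.findtext(f".//{OAI_NS}resumptionToken")
    return records, (token.strip() if token and token.strip() else None)


# Hand-written sample response, not output from a real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_list_records_url("http://example.org/oai"))
    print(parse_list_records(SAMPLE))
```

A harvester would typically repeat the request with each returned resumption token until none is returned; fetching over HTTP is left out here so that the sketch runs without network access.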
Author identifiers: 1) Services at arXiv and 2) ORCID and repositories
I will present two separate but related topics where experience with the first provides much of my perspective with the second.
Public author identifiers and services based on them were introduced in March 2009, and early work and design were reported at OR09. The original services have been running for a year now and additional facilities have been added. I will report on uptake and usage patterns, and describe the more popular services.
ORCID is an exciting initiative involving both commercial and academic participants that aims to build a registry and assign identifiers to address the author ambiguity problem. I will report on the current status of this rapidly evolving project and suggest how the repository community may contribute to and benefit from it.
Plagiarism Detection in arXiv
We describe a large-scale application of methods for finding plagiarism in
research document collections. The methods are applied to a collection of
284,834 documents collected by arXiv.org over a 14-year period, covering a few
different research disciplines. The methodology efficiently detects a variety
of problematic author behaviors, and heuristics are developed to reduce the
number of false positives. The methods are also efficient enough to implement
as a real-time submission screen for a collection many times larger.
Comment: Sixth International Conference on Data Mining (ICDM'06), Dec 200
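The abstract does not say which methods are used, so the following is only a generic illustration of one common text-overlap technique: counting shared word n-grams ("shingles") through an inverted index. It is not the authors' method. The function names, the choice of n and the sample documents are all assumptions made for the example.

```python
import re
from collections import defaultdict
from itertools import combinations


def shingles(text, n=7):
    """Return the set of word n-grams in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_shingle_counts(docs, n=7):
    """Count shared n-grams for every pair of documents that shares at least one.

    docs maps a document id to its text. An inverted index from n-gram to
    document ids means only documents with some overlap are ever compared.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for sh in shingles(text, n):
            index[sh].add(doc_id)

    counts = defaultdict(int)
    for ids in index.values():
        for pair in combinations(sorted(ids), 2):
            counts[pair] += 1
    return dict(counts)


# Hypothetical toy documents, for illustration only.
DOCS = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "we saw the quick brown fox jumps over a fence",
    "c": "completely unrelated text about metadata harvesting protocols",
}

if __name__ == "__main__":
    print(shared_shingle_counts(DOCS, n=3))
```

In a sketch like this, very common n-grams (boilerplate phrases, standard acknowledgements) would inflate the counts, which is one reason a real system needs heuristics to reduce false positives, as the abstract notes.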